8 research outputs found

    Adaptations of neutrality tests

    Most of the genetic variation observed within a biological species is generally thought to be evolutionarily “neutral” in the sense that it is irrelevant for an individual whether its genome contains one particular variant or another. Evolutionary biologists, and in the case of the human species anthropologists and medical scientists as well, are by contrast interested in variants which do influence an individual’s survival and/or its ability to reproduce. Population geneticists try to find such variants by purely statistical methods in the form of tests of neutrality, or neutrality tests for short. In this thesis four publications are reprinted and discussed which are concerned with modifications of existing neutrality tests. Three of them deal with a class of tests relying on the so-called site frequency spectrum. It was shown previously that some of these tests, originally designed on models of constant population size, can be adapted to allow for changes in population size. This is generalized in the first publication to all tests of similar structure. Another aspect of these tests is that they are agnostic with respect to which variant in a sample might evolve non-neutrally. If instead a particular variant is suspected a priori, the tests have to allow for this information by conditioning on the existence of a variant with the observed frequency. The second and third articles introduce the concept of a conditional frequency spectrum and derive its first and second moments, respectively, which are necessary for an appropriate extension of the above-mentioned class of tests. The fourth article presents an algorithmic improvement of a neutrality test of a different kind. Here, computational speed was the primary concern, in order to bear comparison with competing software. Only applications to human data are presented, since such data are available in unrivalled abundance, owing to several large-scale genotyping and sequencing projects. The applicability of neutrality tests, however, is not confined to any particular species.

    The expected neutral frequency spectrum of linked sites

    We present an exact, closed expression for the expected neutral Site Frequency Spectrum for two neutral sites, 2-SFS, without recombination. This spectrum is the immediate extension of the well-known single-site θ/f neutral SFS. Similar formulae are also provided for the case of the expected SFS of sites that are linked to a focal neutral mutation of known frequency. Formulae for finite samples are obtained by coalescent methods and remarkably simple expressions are derived for the SFS of a large population, which are also solutions of the multi-allelic Kolmogorov equations. Besides the general interest of these new spectra, they relate to interesting biological cases such as structural variants and introgressions. As an example, we present the expected neutral frequency spectrum of regions with a chromosomal inversion. Comment: 26 pages, 5 figures

    Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data

    Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One such approach involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.

    The neutral frequency spectrum of linked sites

    We introduce the conditional Site Frequency Spectrum (SFS) for a genomic region linked to a focal mutation of known frequency. An exact expression for its expected value is provided for the neutral model without recombination. Its relation with the expected SFS for two sites, 2-SFS, is discussed. These spectra derive from the coalescent approach of Fu (1995) for finite samples, which is reviewed. Remarkably simple expressions are obtained for the linked SFS of a large population, which are also solutions of the multi-allelic Kolmogorov equations. These formulae are the immediate extensions of the well-known single-site θ/f neutral SFS. Besides the general interest in these spectra, they relate to relevant biological cases, such as structural variants and introgressions. As an application, a recipe to adapt Tajima's D and other SFS-based neutrality tests to a non-recombining region containing a neutral marker is presented. © 2018 Elsevier Inc. All rights reserved.

    OntoELAN: An Ontology-based Linguistic Multimedia Annotator

    Despite its scientific, political, and practical value, comprehensive information about human languages, in all their variety and complexity, is not readily obtainable and searchable. One reason is that many language data are collected as audio and video recordings, which poses a challenge to document indexing and retrieval. Annotation of multimedia data provides an opportunity for making the semantics explicit and facilitates the searching of multimedia documents. We have developed OntoELAN, an ontology-based linguistic multimedia annotator that features: (1) support for loading and displaying ontologies specified in OWL; (2) creation of a language profile, which allows a user to choose a subset of terms from an ontology and conveniently rename them if needed; (3) creation of ontological tiers, which can be annotated with profile terms and, therefore, corresponding ontological terms; and (4) saving annotations in the XML format as Multimedia Ontology class instances and, linked to them, class instances of other ontologies used in ontological tiers. To the best of our knowledge, OntoELAN is the first audio/video annotation tool in the linguistic domain that provides support for ontology-based annotation. Comment: Appeared in the Proceedings of the IEEE Sixth International Symposium on Multimedia Software Engineering (IEEE-MSE'04), pp. 329-336, Miami, FL, USA, December, 200